
    Bot recognition in a Web store: An approach based on unsupervised learning

    Web traffic on e-business sites is increasingly dominated by artificial agents (Web bots), which pose a threat to website security, privacy, and performance. Developing efficient bot detection methods and discovering reliable e-customer behavioural patterns requires accurate separation of the traffic generated by legitimate users from that generated by Web bots. This paper proposes a machine learning solution to the problem of bot and human session classification, with a specific application to e-commerce. The approach explores unsupervised learning (k-means and Graded Possibilistic c-Means) followed by supervised labelling of clusters, a generative learning strategy that decouples modelling the data from labelling them. Its efficiency is evaluated through experiments on real e-commerce data, in realistic conditions, and compared to that of supervised learning classifiers (a multi-layer perceptron neural network and a support vector machine). Results demonstrate that the classification based on unsupervised learning is very efficient, achieving a performance level similar to that of fully supervised classification. This is an experimental indication that the bot recognition problem can be successfully dealt with using methods that are less sensitive to mislabelled data or missing labels. A very small fraction of sessions remain misclassified in both cases, so an in-depth analysis of misclassified samples was also performed. This analysis exposed the superiority of the proposed approach, which correctly recognized more bots and identified more camouflaged agents that had been erroneously labelled as humans.
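
    The cluster-then-label idea described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a plain k-means written from scratch (Graded Possibilistic c-Means is not shown), two made-up session features, and a majority vote over a handful of labelled sessions. All names and data are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means; the first k points seed the centroids (chosen for determinism)."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assign = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, new_assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_assign == assign:
            break
        assign = new_assign
    return centroids, assign


def label_clusters(assign, known_labels, k):
    """Give each cluster the majority label among its few labelled members."""
    votes = {c: Counter() for c in range(k)}
    for idx, label in known_labels.items():
        votes[assign[idx]][label] += 1
    return {c: (v.most_common(1)[0][0] if v else "unknown") for c, v in votes.items()}


def classify(point, centroids, cluster_labels):
    """Label a new session with the label of its nearest cluster."""
    nearest = min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))
    return cluster_labels[nearest]


# Hypothetical features per session: (normalised request rate, share of image requests).
sessions = [
    (0.10, 0.80), (0.90, 0.05), (0.15, 0.75), (0.85, 0.10),
    (0.20, 0.70), (0.95, 0.00), (0.12, 0.78), (0.80, 0.15),
]
# Only three sessions carry labels; clustering itself never sees them.
known = {0: "human", 1: "bot", 3: "bot"}

centroids, assign = kmeans(sessions, k=2)
cluster_labels = label_clusters(assign, known, k=2)
print(classify((0.88, 0.02), centroids, cluster_labels))
```

    Because the labels are attached only after clustering, a mislabelled session can be outvoted within its cluster, which is one way to read the abstract's remark about lower sensitivity to mislabelled data.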

    Cost-oriented recommendation model for e-commerce

    Contemporary Web stores offer a wide range of products to e-customers. However, online sales are strongly dominated by a limited number of bestsellers, whereas other, less popular or niche products are stored in inventory for a long time. Thus, they contribute to the problem of frozen capital and high inventory costs. To cope with this problem, we propose using information on product cost in a recommender system for a Web store. We discuss the proposed recommendation model, which includes two criteria: the predicted degree to which a product meets the customer’s needs, and the product cost.
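
    The abstract does not say how the two criteria are combined. Purely as an assumption, the sketch below uses a weighted sum and favours products with a higher cost; the `Product` fields, the `cost_weight` parameter, and the catalogue are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    predicted_fit: float  # hypothetical predicted relevance in [0, 1]
    cost: float           # hypothetical product cost (must be positive)


def rank(products, cost_weight=0.3):
    """Rank by a weighted sum of predicted fit and normalised cost (an assumed rule)."""
    max_cost = max(p.cost for p in products)

    def score(p):
        return (1 - cost_weight) * p.predicted_fit + cost_weight * p.cost / max_cost

    return sorted(products, key=score, reverse=True)


catalog = [
    Product("bestseller", predicted_fit=0.9, cost=5.0),
    Product("niche title", predicted_fit=0.8, cost=40.0),
    Product("poor match", predicted_fit=0.2, cost=50.0),
]
ranking = [p.name for p in rank(catalog)]
print(ranking)
```

    With `cost_weight=0.0` the ranking reduces to ordering by predicted fit alone, so the weight makes the trade-off between the two criteria explicit.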

    Practical Aspects of Log File Analysis for E-Commerce

    The paper concerns Web server log file analysis to discover knowledge useful for online retailers. One month of data from an online bookstore's operation was analyzed with respect to the probability of e-customers making a purchase. Key states and characteristics of user sessions were distinguished, and their relations to the session state connected with purchase confirmation were analyzed. The results allow identification of factors that increase the probability of making a purchase in a given Web store and thus determination of the user sessions that are more valuable in terms of e-business profitability. Such results may then be applied in practice, e.g. in a method for personalized or prioritized service in the Web server system.
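
    One simple way to relate session states to purchase confirmation is a relative-frequency estimate of P(purchase | state visited). The sketch below assumes sessions have already been reconstructed from the log; the state names and the data are hypothetical, and the paper's actual analysis may differ.

```python
from collections import defaultdict

# Hypothetical sessions: the set of states visited and whether a purchase was confirmed.
sessions = [
    ({"home", "search", "add_to_cart", "login"}, True),
    ({"home", "search"}, False),
    ({"home", "add_to_cart"}, False),
    ({"home", "search", "add_to_cart", "login"}, True),
    ({"home", "login"}, False),
    ({"home", "search", "add_to_cart"}, True),
]


def purchase_rate_by_state(sessions):
    """Estimate P(purchase | state visited) as a simple relative frequency."""
    visits = defaultdict(int)
    purchases = defaultdict(int)
    for states, bought in sessions:
        for state in states:
            visits[state] += 1
            purchases[state] += int(bought)
    return {state: purchases[state] / visits[state] for state in visits}


rates = purchase_rate_by_state(sessions)
for state, rate in sorted(rates.items()):
    print(f"{state}: {rate:.2f}")
```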

    A k-Nearest Neighbors Method for Classifying User Sessions in E-Commerce Scenario, Journal of Telecommunications and Information Technology, 2015, no. 3

    This paper addresses the problem of classifying user sessions in an online store into two classes: buying sessions (during which a purchase confirmation occurs) and browsing sessions. As interactions connected with a purchase confirmation are typically completed at the end of user sessions, some information describing active sessions may be observed and used to assess the probability of making a purchase. The authors formulate the problem of predicting buying sessions in a Web store as a supervised classification problem with two target classes, depending on whether a purchase transaction is finalized in the session or not, and a feature vector containing variables that describe user sessions. The presented approach uses k-Nearest Neighbors (k-NN) classification. Based on historical data obtained from online bookstore log files, a k-NN classifier was built and its efficiency was verified for different neighborhood sizes. An 11-NN classifier was the most effective in terms of both buying session predictions and overall predictions, achieving a sensitivity of 87.5% and an accuracy of 99.85%.
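
    The two reported metrics measure different things: sensitivity counts only the buying sessions that were recognized, while accuracy counts all correct predictions. A minimal k-NN sketch makes the distinction concrete. It uses k = 3 on toy data rather than the paper's 11-NN on real logs, and the two features (pages viewed, items in cart) are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training sessions closest to the query."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def sensitivity_and_accuracy(train, test, k):
    """Sensitivity = TP / (TP + FN) for the buying class; accuracy = correct / all."""
    tp = fn = correct = 0
    for features, label in test:
        predicted = knn_predict(train, features, k)
        correct += predicted == label
        if label == "buy":
            tp += predicted == "buy"
            fn += predicted != "buy"
    return tp / (tp + fn), correct / len(test)


# Hypothetical feature vectors: (pages viewed, items in cart).
train = [
    ((3, 0), "browse"), ((5, 0), "browse"), ((4, 1), "browse"), ((2, 0), "browse"),
    ((12, 2), "buy"), ((15, 3), "buy"), ((10, 2), "buy"),
]
test = [
    ((14, 2), "buy"), ((3, 1), "browse"), ((11, 3), "buy"),
    ((6, 0), "browse"), ((5, 1), "buy"),  # the last buyer looks like a browser
]

sensitivity, accuracy = sensitivity_and_accuracy(train, test, k=3)
print(f"sensitivity={sensitivity:.3f} accuracy={accuracy:.3f}")
```

    In this toy run one buying session is missed, so sensitivity falls further than accuracy. The same effect explains how a classifier can report very high accuracy alongside lower sensitivity when buying sessions are rare.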

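    Propozycja wykorzystania informacji biznesowych w mechanizmie jakości usług dla serwera e-commerce [A proposal for using business information in a quality of service mechanism for an e-commerce server]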

    Because a low quality of service (QoS) has very negative and long-term consequences for e-business, a number of QoS mechanisms for Web servers have been proposed. As a continuation of this research trend, the paper proposes a novel way of using business information in an admission control and request scheduling scheme for an e-commerce server, with the aim of integrating server system efficiency with e-business profitability.

    W poszukiwaniu determinantów samopodobieństwa ruchu HTTP na serwerze www [In search of determinants of HTTP traffic self-similarity on a Web server]

    The paper investigates the statistical self-similarity of HTTP traffic on a Web server and the factors that may affect it. The Hurst parameter was estimated using three different methods, the degree of Web traffic burstiness was determined by computing so-called burstiness parameters, and the correlation of the mean Hurst parameter with the burstiness parameters and other traffic features was then examined.
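
    The abstract does not name the three estimation methods. As an illustration only, the sketch below uses the aggregated-variance estimator, a commonly used technique that may or may not be among them. It relies on the fact that, for a self-similar process, the variance of block means scales as m^(2H - 2) with block size m. The input series is a hypothetical stand-in for per-second request counts.

```python
import math
import random
from statistics import mean, pvariance


def hurst_aggregated_variance(series, block_sizes=(2, 4, 8, 16, 32, 64)):
    """Estimate H from the slope of log Var(block means) versus log block size."""
    xs, ys = [], []
    for m in block_sizes:
        # Split the series into non-overlapping blocks of length m and average each.
        blocks = [series[i:i + m] for i in range(0, len(series) - m + 1, m)]
        block_means = [mean(b) for b in blocks]
        xs.append(math.log(m))
        ys.append(math.log(pvariance(block_means)))
    # Least-squares slope; for a self-similar process the slope is 2H - 2.
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return 1 + slope / 2


random.seed(0)
# Uncorrelated Gaussian noise has no long-range dependence, so H should be near 0.5.
noise = [random.gauss(100, 10) for _ in range(4096)]
h = hurst_aggregated_variance(noise)
print(f"estimated H = {h:.2f}")
```

    Values of H noticeably above 0.5 would indicate long-range dependence, the property usually associated with self-similar Web traffic.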